ABSTRACT The Escherichia coli sequence type 131 (ST131) clone is notorious for extraintestinal infections, fluoroquinolone resistance, and extended-spectrum beta-lactamase (ESBL) production, attributable to a CTX-M-15-encoding mobile element. Here, we applied pulsed-field gel electrophoresis (PFGE) and whole-genome sequencing to reconstruct the evolutionary history of the ST131 clone. PFGE-based cluster analyses suggested that both fluoroquinolone resistance and ESBL production had been acquired by multiple ST131 sublineages through independent genetic events. In contrast, the more robust whole-genome-sequence-based phylogenomic analysis revealed that fluoroquinolone resistance was confined almost entirely to a single, rapidly expanding ST131 subclone, designated H30-R. Strikingly, 91% of the CTX-M-15-producing isolates also belonged to a single, well-defined clade nested within H30-R, which was named H30-Rx due to its more extensive resistance. Despite its tight clonal relationship with H30-Rx, the CTX-M-15 mobile element was inserted variably in plasmid and chromosomal locations within the H30-Rx genome. Screening of a large collection of recent clinical E. coli isolates both confirmed the global clonal expansion of H30-Rx and revealed its disproportionate association with sepsis (relative risk, 7.5; P < 0.001). Together, these results suggest that the high prevalence of CTX-M-15 production among ST131 isolates is due primarily to the expansion of a single, highly virulent subclone, H30-Rx.

IMPORTANCE We applied an advanced genomic approach to study the recent evolutionary history of one of the most important Escherichia coli strains in circulation today. This strain, called sequence type 131 (ST131), causes multidrug-resistant bladder, kidney, and bloodstream infections around the world. The rising prevalence of antibiotic resistance in E. coli is making these infections more difficult to treat and is leading to increased mortality. Past studies suggested that many different ST131 strains gained resistance to extended-spectrum cephalosporins independently. In contrast, our research indicates that most extended-spectrum-cephalosporin-resistant ST131 strains belong to a single highly pathogenic subclone, called H30-Rx. The clonal nature of H30-Rx may provide opportunities for vaccine or transmission prevention-based control strategies, which could gain importance as H30-Rx and other extraintestinal pathogenic E. coli subclones become resistant to our best antibiotics.
Horizontal gene transfer is one of the most powerful forces in bacterial evolution. The transformative potential of this process is perhaps best exemplified by the acquisition of antimicrobial resistance determinants: in a single genetic event, an antimicrobial-susceptible bacterium can acquire a complex suite of resistance determinants and become resistant to multiple antimicrobials. Thus, frequent horizontal transfer between different strains can potentially drive the spread of antibiotic resistance within the bacterial population, without any change in the distribution of strains. However, when virulent bacterial clones acquire such elements, they can emerge rapidly within the population through clonal expansion and thereby gain local or even global predominance (1-3). Quantifying the relative contribution of horizontal (gene transfer) and vertical (clonal expansion) mechanisms to the emergence of multidrug-resistant bacterial pathogens will provide important insights into the evolution of these pathogens and inform novel intervention strategies.

FIG 1 PFGE dendrogram and whole-genome SNP-based phylogeny of E. coli ST131. (A) PFGE-based dendrogram of E. coli ST131 isolates (n = 524), as inferred within BioNumerics according to the unweighted pair group method based on Dice similarity coefficients. (B) Whole-genome SNP-based phylogeny of selected ST131 isolates (n = 105) and the NA114 reference genome. SNPs were identified from genomic regions equivalent to approximately 44.7% of the reference genome that was shared among all isolates and sequenced at 10× coverage. Analysis of these shared genomic regions revealed 2,531 parsimony-informative and 4,000 total SNPs from the core genome (excluding horizontally acquired regions) that were used to construct the phylogeny presented here. Homoplasy index (HI) = 0.012. The purple block highlights the H30 subclone.

In 2008, a previously unrecognized Escherichia coli clonal group, sequence type 131 (ST131), was identified in nine countries, spanning three continents (4, 5). Today, ST131 is the dominant extraintestinal pathogenic E. coli (ExPEC) strain worldwide, but retrospective analyses suggest that the pandemic emergence of ST131 took place over a period of fewer than 10 years (6, 7). ST131 is part of the virulent phylogenetic group B2 and has been reported to cause a wide range of infections, including meningitis, osteomyelitis, myositis, epididymo-orchitis, and peritonitis (6, 8-10). However, ST131 is most commonly associated with urinary tract infection (UTI) and is a major etiologic agent of bladder infections, kidney infections, and urosepsis in the United States and internationally (11-14). Population genetics analysis of ST131 isolates indicated that the recent epidemic spread of this group is driven by descendants of a single strain, named subclone H30, that differ from the members of other, less prevalent ST131 subclones by carriage of fimH30, an allele of the gene encoding the mannose-specific type 1 fimbrial adhesin, FimH (15).

Over the last decade, the emergence of multidrug-resistant ExPEC strains has made UTI treatment more problematic, leading to discordant antimicrobial therapy and increased morbidity and mortality (16-19). This increase in multidrug-resistant UTIs has in large part been due to the rapid rise in prevalence of ExPEC strains, particularly from ST131, harboring determinants for extended-spectrum β-lactamases (ESBLs) and resistance to trimethoprim-sulfamethoxazole and fluoroquinolones (FQ) (16, 20-27).

The CTX-M-15 β-lactamase is the dominant ESBL in ST131 and is increasingly found in isolates causing both UTI and bacteremia (13, 28-31). The CTX-M gene phylogeny suggests that these enzymes arose through mobilization of chromosomal β-lactamase (bla) genes from the gut commensal organism Kluyvera (32). A number of CTX-M enzymes have risen to prevalence since the 1990s, with a new CTX-M type frequently appearing in multiple distant countries simultaneously, suggesting independent transfer events (33). This, together with the substantial diversity in transferable resistance elements in ST131, has led some to conclude that horizontal transfer must be the dominant mechanism whereby ESBLs have gained prominence among strains of the ST131 clone (7, 34, 35). However, other evidence suggests that clonal expansion contributes significantly to the spread of antimicrobial resistance within E. coli (36-39).

FIG 2 High-resolution phylogenetic analysis of the emergence of fluoroquinolone resistance and CTX-M-15 production. Approximately 51.8% of the reference genome was shared among all isolates and sequenced at 10× coverage. Analysis of these shared genomic regions revealed 72 parsimony-informative SNPs and 771 total SNPs from the core genome (excluding horizontally acquired regions) that were used to construct the phylogeny presented here. Homoplasy index (HI) = 0.000. The colored blocks highlight the three nested ST131 subclones, H30 (purple), H30-R (blue), H30-Rx (yellow).

Until recently, our knowledge of the epidemiology and dispersal of bacterial strains, including those of ST131 origin, has been based largely on multilocus sequence typing (MLST), which has limited resolution at the subclone level, and on pulsed-field gel electrophoresis (PFGE) analysis, which is highly vulnerable to distortions from horizontal gene transfer events and subjective interpretation. In the current study, we used whole-genome single-nucleotide polymorphism (SNP) analysis to reconstruct the ST131 phylogeny and then overlaid resistance determinants and phenotypic susceptibility on this phylogeny to elucidate the evolutionary history of fluoroquinolone resistance and ESBL production within this prominent pathogen.

RESULTS 

PFGE analyses. A collection of 524 ST131 isolates cultured from humans and animals between 1967 and 2011 was analyzed using PFGE, which yielded a complex dendrogram (Fig. 1A). Within the PFGE-based dendrogram, the isolates that were fluoroquinolone resistant and/or blaCTX-M-15 positive, although exhibiting some clustering, were intermingled extensively with those that were fluoroquinolone susceptible and/or blaCTX-M-15 negative. As such, the PFGE analysis supported previous reports suggesting frequent horizontal acquisition of fluoroquinolone resistance determinants and blaCTX-M-15 among different ST131 subclones.

Whole-genome SNP-based phylogenetic reconstruction of ST131. From the total collection that underwent PFGE analysis, 105 ST131 isolates were systematically selected for genome sequencing according to prespecified criteria that emphasized diversity of genetic backgrounds according to PFGE. The 105 isolates, which derived from five countries and 23 states and provinces in Canada and the United States, included 22 CTX-M-15-producing isolates, which were widely distributed across the PFGE dendrogram (Fig. 1A).

Genomic comparisons identified SNP loci that were present in all isolates and, therefore, informative for phylogenetic reconstruction. The first phylogenetic tree included non-ST131 strain AA86 (group B2; ST1876) (40) as an outgroup, to root the tree and to identify the basal clones within the ST131 phylogeny (see Fig. S1 in the supplemental material). Next, strain AA86 was excluded, and a new SNP matrix and phylogenetic tree were generated (see Fig. S2 in the supplemental material). Since (distant) strain AA86 lacks some of the genomic regions found within the ST131 clone, exclusion of AA86 increased the number of shared genomic regions in the sequence alignment and, therefore, increased the number of informative SNPs with which to resolve the ST131 phylogeny.
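
The restriction of the SNP matrix to positions called in every isolate, and the distinction between total and parsimony-informative SNPs, can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the function names, the use of "N" for an uncalled position, and the toy alignment are all assumptions. A site is treated as parsimony-informative when at least two different bases each occur in at least two isolates, which is the standard definition.

```python
from collections import Counter

MISSING = "N"  # assumed placeholder for a position not called in an isolate


def core_sites(alignment):
    """Return the column indices that are called (non-missing) in every isolate."""
    length = len(next(iter(alignment.values())))
    return [
        i for i in range(length)
        if all(seq[i] != MISSING for seq in alignment.values())
    ]


def classify_site(column):
    """Label one alignment column as 'invariant', 'singleton', or 'informative'."""
    counts = Counter(column)
    if len(counts) < 2:
        return "invariant"
    # Informative: at least two states that are each seen in two or more isolates.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "singleton"


def summarize(alignment):
    """Tally site classes over the shared (core) positions of an alignment."""
    tally = Counter()
    for i in core_sites(alignment):
        tally[classify_site([seq[i] for seq in alignment.values()])] += 1
    return dict(tally)


# Hypothetical four-isolate alignment, for illustration only.
toy = {
    "iso1": "ACGTA",
    "iso2": "ACGTN",
    "iso3": "ATGCA",
    "iso4": "ATGTA",
}
without_iso2 = {name: seq for name, seq in toy.items() if name != "iso2"}
```

In this toy case `core_sites(toy)` omits the last column because `iso2` has no call there, whereas `core_sites(without_iso2)` includes it. Dropping an isolate can only keep or enlarge the core, which mirrors why excluding the distant AA86 outgroup increased the number of shared positions.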

The homoplasy index (HI) for these two initial trees (see Fig. S1 and S2) was exceedingly high (>0.33), indicating substantial recombination. Phylogenetic reconstructions that include genomic regions acquired by horizontal gene transfer will not accurately represent the evolutionary history of clonal organisms. However, such phylogenies can be used to identify the regions acquired horizontally. This was accomplished here by mapping to the reference genome the HI values for individual SNPs, which revealed four large recombinant regions representing nearly 31% of the genome.
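
As a rough illustration of what a homoplasy index measures, the sketch below computes HI as 1 minus the consistency index, where the consistency index is the minimum conceivable number of changes (number of observed states minus one per site) divided by the number of changes the tree actually requires. The changes required are counted with Fitch parsimony on a rooted binary tree. The source does not state which software or exact formula was used, so this definition, the nested-tuple tree encoding, and the toy data are assumptions.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one site on a rooted binary tree.

    tree:   nested 2-tuples with isolate names (str) at the leaves.
    states: dict mapping isolate name -> base observed at this site.
    """
    def walk(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_steps = walk(node[0])
        right_set, right_steps = walk(node[1])
        common = left_set & right_set
        if common:
            return common, left_steps + right_steps
        # Disjoint child sets force one additional change at this node.
        return left_set | right_set, left_steps + right_steps + 1

    return walk(tree)[1]


def homoplasy_index(tree, site_states):
    """HI = 1 - CI, with CI = sum(minimum changes) / sum(changes on the tree)."""
    minimum = observed = 0
    for states in site_states:
        n_states = len(set(states.values()))
        if n_states < 2:
            continue  # invariant sites carry no phylogenetic signal
        minimum += n_states - 1
        observed += fitch_steps(tree, states)
    if observed == 0:
        return 0.0
    return 1 - minimum / observed


# Hypothetical four-isolate tree and two SNP sites, for illustration only.
tree = (("a", "b"), ("c", "d"))
clean_site = {"a": "A", "b": "A", "c": "G", "d": "G"}   # one change fits the tree
homoplastic = {"a": "A", "b": "G", "c": "A", "d": "G"}  # needs two changes
```

Calling `homoplasy_index(tree, [site])` one site at a time gives a per-SNP value, which is the kind of quantity that could be plotted against reference-genome position to look for clusters of homoplastic SNPs such as the recombinant regions described above.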

Exclusion of SNPs from the four horizontally acquired regions resulted in trees with minimal homoplasy (homoplasy index [HI] = 0.012) (see Fig. S3 in the supplemental material), suggestive of highly accurate phylogenies (41). Figure 1B shows the resultant whole-genome SNP phylogeny for the 105 ST131 isolates, plus the strain NA114 reference ST131 genome (42).



Whole-genome-based clustering of resistant subclones. The whole-genome SNP-based phylogeny showed distinct clustering of strains carrying specific fimH alleles (Fig. 1B), as well as gyrA and parC alleles and O type (see Dataset S1 and Table S1 in the supplemental material). In particular, strains carrying the fimH30 allele clustered as a single low-diversity clade, designated H30, which included 58 (95%) of the 61 fluoroquinolone-resistant isolates. Moreover, nearly all of the CTX-M-15-producing isolates, despite appearing to have diverse genetic backgrounds according to the PFGE-based dendrogram (Fig. 1A), collapsed into a distinct subclade within the H30 clade (Fig. 1B).

To further resolve the evolutionary history of the H30 subclone, genomic sequences from the 64 H30 isolates and their three nearest neighbors were analyzed separately from the rest of the isolates (Fig. 2). Aligning these sequences to the finished NA114 reference genome increased the number of shared nucleotides and revealed additional informative SNPs that were used to generate the high-resolution and highly accurate (HI = 0.000) phylogenetic tree shown in Fig. 2. This tree suggested that acquisition of the fimH30 allele preceded the acquisition of fluoroquinolone resistance by a single ancestor within the H30 subclone, which was followed by a large clonal expansion of fluoroquinolone-resistant H30 strains. To distinguish the clonally related fluoroquinolone-resistant H30 isolates from the ancestral fluoroquinolone-susceptible H30 isolates, the resistant subclone within H30 was designated H30-R.

In this high-resolution phylogeny, 20 (91%) of the 22 ST131 isolates that carried blaCTX-M-15 (including isolates from Australia, South Korea, Portugal, Canada, and the United States) formed a distinct, single-ancestor subclone within H30-R. Because of its more extensive resistance characteristics, this blaCTX-M-15-associated subclone was designated H30-Rx (Fig. 2). Three canonical SNPs distinguished H30-Rx from the rest of H30-R with 100% fidelity.

Genomic location of the CTX-M-15 element. Multiple previous studies have reported that blaCTX-M-15 is positioned on a conjugative IncFII-type plasmid as part of a Tn3-like ISEcp1-blaCTX-M-15-orf477 mobile element. Here, we performed an in silico analysis to characterize the structure and genomic location of the CTX-M-15 mobile element among the 22 blaCTX-M-15-positive ST131 isolates. In each instance, blaCTX-M-15 was part of a typical Tn3-like ISEcp1-blaCTX-M-15-orf477 transposable element. While no SNPs were identified among the different CTX-M-15 elements, the regions flanking blaCTX-M-15 were frequently degraded by insertion/deletion (not shown). The Illumina short-read sequences were sufficient to reliably identify the element's insertion site for all but three of the 22 CTX-M-15-positive isolates sequenced in the current study (Table 1). The insertion site varied among the isolates: 13 carried a single copy on an IncFII-type plasmid, four carried a single copy on the chromosome, and two carried one copy on the chromosome and another copy on an IncFII-type plasmid (Table 1). Moreover, among the strains with a chromosomally located element, five distinct chromosomal insertion sites were identified; only two strains, JJ1886 and JJ1887, which are the nearest neighbors in the SNP phylogeny (Fig. 2), carried the element in the same chromosomal location (Table 1).

Association of ESBL production and blaCTX-M-15 with the H30-Rx ST131 subclone. Because most of the isolates in the phylogenetic trees were of historic (i.e., pre-2009) origin, we assessed the generalizability of the observed association of the H30-Rx subclone with ESBL production and blaCTX-M-15 by analyzing more recent clinical isolates, i.e., from 2011 to 2013. For this, a total of 261 ST131 isolates from Seattle, WA, Minneapolis, MN, and Münster, Germany, were assessed for fimH allele type, H30-Rx subclone membership, ESBL production, and possession of blaCTX-M-15 (Table 2).

TABLE 1 CTX-M-15 element locations among the 22 Escherichia coli ST131 isolates

Isolate name    Subclone    Genomic location^a    Chromosomal insertion site^b

NA

Non-H30    Plasmid

a Marked as undetermined if the in silico analyses provided equivocal results.
b Chromosomal location based on the JJ1886 closed genome. NA, not applicable.
c Reported previously.

Among the 261 recent ST131 isolates, 174 (67%) belonged to the H30 subclone, whereas the remaining 87 (33%) carried one of several other ST131-associated fimH alleles, as described recently (15). Among the 174 H30 isolates, the 163 (94%) that were fluoroquinolone resistant were defined as H30-R. Detection of H30-Rx-specific SNPs showed that H30-Rx comprised 44 (27%) of 163 H30-R strains (Table 2).
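
The nested definitions used in this screen (fimH30 carriage for H30, fluoroquinolone resistance within H30 for H30-R, and the canonical SNPs within H30-R for H30-Rx) amount to a short decision rule. The sketch below is one way to write it down, assuming hypothetical per-isolate fields; the function name, the field layout, and the placeholder allele number for non-H30 isolates are not from the source. The synthetic records are rebuilt from the counts reported in the text and are not real isolate data.

```python
from collections import Counter


def classify_st131(fimh_allele, fq_resistant, has_h30rx_snps):
    """Assign an ST131 isolate to one of the four mutually exclusive groups."""
    if fimh_allele != 30:
        return "non-H30"
    if not fq_resistant:
        return "H30, non-H30-R"
    if has_h30rx_snps:
        return "H30-Rx"
    return "H30-R, non-H30-Rx"


# Synthetic (fimH allele, FQ resistant, H30-Rx SNPs) records matching the
# reported group sizes. Allele 22 is an arbitrary stand-in for "not fimH30".
isolates = (
    [(22, False, False)] * 87
    + [(30, False, False)] * 11
    + [(30, True, False)] * 119
    + [(30, True, True)] * 44
)

counts = Counter(classify_st131(*record) for record in isolates)
total = sum(counts.values())
h30 = total - counts["non-H30"]
h30_r = counts["H30-R, non-H30-Rx"] + counts["H30-Rx"]

pct_h30 = round(100 * h30 / total)               # H30 share of all ST131
pct_h30_r = round(100 * h30_r / h30)             # H30-R share of H30
pct_h30_rx = round(100 * counts["H30-Rx"] / h30_r)  # H30-Rx share of H30-R
```

With these counts the three percentages reproduce the 67%, 94%, and 27% quoted above. How an isolate with H30-Rx SNPs but without fimH30 or fluoroquinolone resistance should be labeled is not addressed in the text; the rule here simply applies the criteria in nesting order.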

Among the 44 H30-Rx isolates, 34 (77%) were ESBL producing, and 33 of these carried blaCTX-M-15. This was very similar to the 74% carriage of blaCTX-M-15 observed among the genome-sequenced historic H30-Rx isolates (Fig. 2) but significantly higher than the low prevalence of either ESBL production or blaCTX-M-15 carriage observed among the recent non-H30 ST131 isolates (3% for each trait), H30 but not H30-R isolates (9% for each trait), and H30-R but not H30-Rx isolates (6% and 2% for the two traits, respectively) (Table 2). Thus, blaCTX-M-15 accounted for nearly all ESBL-producing isolates within the H30-Rx subclone; conversely, within ST131 overall, the H30-Rx subclone accounted for the vast majority of ESBL production and, especially, blaCTX-M-15 carriage. Moreover, this tight association between H30-Rx and blaCTX-M-15 held true across the different laboratories that supplied the recent clinical isolates (data not shown).

Demographic, geographic, and clinical prevalence of H30-Rx. We also assessed the relative prevalence of the H30-R and H30-Rx subclones within the total ST131 population in relation to patient population and locale by comparing urine isolates from Group Health Cooperative in Seattle, WA, which serves an almost exclusively outpatient population, with urine isolates from hospital laboratories in the United States and Germany that serve mixed inpatient and outpatient populations. The relative prevalence of H30-Rx was highest among the German hospital isolates (where it exceeded the prevalence even of other H30-R isolates), intermediate among United States-based hospital isolates, and lowest among the Group Health outpatient isolates (Table 3).

Data regarding presence/absence of clinically diagnosed sepsis were available for 162 of the recent United States ST131 clinical isolates, among which 12 source patients (7%) overall were diagnosed with sepsis (Table 4), a value similar to the 5.2% overall prevalence of diagnosed sepsis among the 1,133 extraintestinal clinical isolates from which the 162 ST131 strains were derived (P = 0.26) (15). However, sepsis was diagnosed in 28% of the patients with an H30-Rx isolate (Table 4), a significantly greater proportion than among patients with a non-H30-Rx, H30-R isolate (6%; P = 0.02), a non-H30, ST131 isolate (4%; P = 0.01), any non-H30-Rx, ST131 isolate (5%; P = 0.005), or a non-ST131 isolate (5.6%; P = 0.003). For H30-Rx isolates compared to other ST131 isolates, the relative risk of associated sepsis was 7.5 (95% confidence interval, 2.3 to 23.8).
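
The comparison of H30-Rx with all other ST131 isolates can be recomputed from the Table 4 counts (5 of 18 H30-Rx patients with sepsis versus 7 of 144 others). The sketch below implements a two-sided Fisher's exact test directly from the hypergeometric distribution and also derives the risk ratio and odds ratio. It is an illustration only: the function name is hypothetical, and the source does not state which software or effect-size formula was used. With these counts the odds ratio, rather than the risk ratio, comes out near the reported 7.5, so the published figure may reflect an odds ratio or a calculation not described in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts taken from Table 4: rows are H30-Rx and all other ST131 isolates,
# columns are sepsis and no sepsis.
sepsis_rx, no_sepsis_rx = 5, 13
sepsis_other, no_sepsis_other = 7, 137

p_value = fisher_exact_two_sided(sepsis_rx, no_sepsis_rx, sepsis_other, no_sepsis_other)
risk_ratio = (sepsis_rx / (sepsis_rx + no_sepsis_rx)) / (
    sepsis_other / (sepsis_other + no_sepsis_other)
)
odds_ratio = (sepsis_rx * no_sepsis_other) / (no_sepsis_rx * sepsis_other)
```

The P value obtained this way rounds to 0.005, in line with the value given in the text and in the Table 4 footnote for this comparison.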

DISCUSSION 

The results of this study provide compelling evidence that clonal 
expansion is the dominant mechanism for the proliferation of 



TABLE 2 Association of ST131 subclones with resistance traits among 261 recent clinical isolates of Escherichia coli ST131 from the United States and Germany

Prevalence of resistance trait, no. (%), by ST131 subclone(s). H30 (n = 174) comprises the non-H30-R, non-H30-Rx, and H30-Rx columns; H30-R (n = 163) comprises the non-H30-Rx and H30-Rx columns.

Resistance trait    Total ST131 strains    Non-H30     Non-H30-R    Non-H30-Rx    H30-Rx
                    (n = 261)              (n = 87)    (n = 11)     (n = 119)     (n = 44)
FQ resistant        163 (62)               0 (0)       0 (0)        119 (100)     44 (100)^a
ESBL                45 (17)                3 (3)       1 (9)        7 (6)         34 (77)^b
blaCTX-M-15         39 (15)                3 (3)       1 (9)        2 (2)         33 (75)^c

a For the fluoroquinolone (FQ)-resistant fraction, H30-Rx compared to other ST131 isolates, P < 0.001 (Fisher's exact test [FET]).
b For prevalence of extended-spectrum β-lactamase (ESBL) production, H30-Rx compared to other ST131 isolates, P < 0.001 (FET).
c For prevalence of blaCTX-M-15, H30-Rx compared to other ST131 isolates, P < 0.001 (FET).
TABLE 3 Prevalence of ST131 subclones in relation to source population among 261 recent clinical isolates of Escherichia coli ST131 from the United States and Germany

ST131 subclones, no. (%). H30 (n = 174) comprises the non-H30-R, non-H30-Rx, and H30-Rx columns; H30-R (n = 163) comprises the non-H30-Rx and H30-Rx columns.

Source population           Total no. of       Non-H30     Non-H30-R    Non-H30-Rx    H30-Rx
                            ST131 isolates     (n = 87)    (n = 11)     (n = 119)     (n = 44)
United States ambulatory    86                 35 (41)     3 (4)        42 (49)       6 (7)^a,b
United States hospital      120                32 (27)     4 (3)        64 (53)       20 (17)^a,c
German hospital             55                 20 (36)     4 (7)        13 (24)       18 (33)^b,c

a For prevalence of H30-Rx, United States ambulatory compared to United States hospital isolates, P = 0.054 (FET).
b For prevalence of H30-Rx, United States ambulatory compared to German hospital isolates, P < 0.001 (FET).
c For prevalence of H30-Rx, United States hospital compared to German hospital isolates, P = 0.03 (FET).



both CTX-M-15 production and fluoroquinolone resistance in E. coli ST131. Past studies have shown that the determinants for both of these traits can be acquired through horizontal gene transfer or, in the case of fluoroquinolone resistance, independent mutation (15). However, the whole-genome SNP-based phylogenies presented here show that almost all of the fluoroquinolone-resistant ST131 isolates belong to a distinct subclone, H30-R, which was derived from a single common ancestor carrying the fimH30 allele (i.e., part of the H30 subclone). Likewise, 91% of the CTX-M-15-producing isolates form another distinct subclone, H30-Rx, which was derived from a single common ancestor within the H30-R subclone. These nested subclones form a Russian-doll-like configuration, within which each subsequent lineage is more extensively resistant than the former.

The nearly exclusive confinement of the CTX-M-15 element to the H30-Rx subclone was in striking contrast to this element's promiscuity within the ST131 genome. We identified H30-Rx isolates with a copy of the element on the chromosome, on an IncFII-type plasmid, and, in some instances, on both the chromosome and a plasmid. Moreover, among those isolates with a chromosomal CTX-M-15 element, the element was inserted in five different locations. Notably, the only two isolates with identical chromosomal insertion sites were recovered from epidemiologically linked adult siblings who both suffered from UTIs of varying severity and were suspected of sharing the same ST131 strain (43), as supported here by the close proximity of these isolates in the high-resolution H30 phylogenetic tree (Fig. 2). Among the 22 CTX-M-15-producing strains, the CTX-M-15 elements exhibited no SNPs and, therefore, no phylogenetic signal that could be compared with the host strain phylogeny.

One cannot exclude the possibility that chromosomal and even some plasmid-borne CTX-M-15 elements were acquired horizontally by H30-Rx on multiple occasions. However, three lines of evidence suggest that within H30-Rx, the chromosomally encoded CTX-M-15 is the result of repeated intragenomic mobilization of the plasmid-located Tn3-like ISEcp1-blaCTX-M-15-orf477 element, rather than independent horizontal acquisition events. First, the sequence identity of the CTX-M-15 elements suggests that they represent copies of the same recently derived genetic element. Second, the plasmid location is the most common among H30-Rx isolates. Third, some strains with chromosomally encoded CTX-M-15 (e.g., JJ1886 and JJ1887) also maintain an IncFII-type plasmid, albeit missing the CTX-M-15 element, consistent with the CTX-M-15 element having moved from the plasmid to the chromosome.

The exact evolutionary history of CTX-M-15 acquisition in H30-Rx also remains to be elucidated. It is possible that H30-Rx was founded by an ancestor that acquired CTX-M-15 on an IncF-type plasmid, which expanded and differentiated along with the subclone. Under this model, the CTX-M-15-negative H30-Rx isolates would represent independent gene loss events. An alternative explanation is that the H30-Rx subclone was founded by a CTX-M-15-negative ancestor and partially expanded before becoming extensively cephalosporin resistant later, through horizontal acquisition of the CTX-M-15 element. Indeed, this subclone may be under differential third-generation cephalosporin selection due to


TABLE 4 Association of ST131 subclones with clinical sepsis among 162 recent clinical isolates of Escherichia coli ST131 from the United States

No. (%) of isolates with associated clinical presentation, by ST131 subclone(s). The H30 group comprises the non-H30-R, non-H30-Rx, and H30-Rx columns; the H30-R group comprises the non-H30-Rx and H30-Rx columns (n = 174 and n = 163, respectively, in the full set of 261 isolates).

Clinical presentation    Total ST131 isolates    Non-H30     Non-H30-R    Non-H30-Rx    H30-Rx
                         (n = 162)               (n = 56)    (n = 6)      (n = 82)      (n = 18)
No sepsis                150 (92.6)              54 (96)     6 (100)      77 (94)       13 (72)
Sepsis                   12 (7.4)                2 (4)^a     0 (0)^b      5 (6)^c       5 (28)^a,b,c,d

a For prevalence of sepsis, H30-Rx compared to non-H30, P = 0.008 (FET).
b For prevalence of sepsis, H30-Rx compared to non-H30-R (H30), P = 0.28 (FET).
c For prevalence of sepsis, H30-Rx compared to non-H30-Rx (H30-R), P = 0.016 (FET).
d For prevalence of sepsis, H30-Rx compared to all other ST131 (7/144, 5%), P = 0.005 (FET).
its enhanced virulence, which might result in its being exposed to aggressive antimicrobial therapy more commonly than other E. coli strains. Further investigation, including total chromosome and plasmid closure, could clarify the most probable mechanism for the association between H30-Rx and CTX-M-15. Interestingly, however, the blaCTX-M-15-positive, FQ-resistant JJ2244 strain, which occupies a phylogenetic outgroup position relative to both H30-R and H30-Rx (see Fig. 2), likely represents an independently emerged multidrug-resistant clonal lineage within the H30 subclone. Indeed, this strain not only lacks all canonical H30-Rx SNPs but also has a distinct, recombinant FQ resistance-conferring gyrA-parC allele combination (i.e., 1AB/4A), compared with that present in H30-R (i.e., 1AB/1aAB).

The results of this whole-genome SNP-based analysis depicted a considerably different evolutionary history for ST131 from that derived from PFGE analysis. Our use of an iterative approach to identify and exclude SNPs from recombinant regions elucidated an evolutionary path marked by clonal expansions rather than frequent lateral gene acquisitions. This underscores one of the major advantages of a whole-genome SNP-based approach relative to PFGE. Although PFGE likewise uses signatures from throughout the genome, it is highly vulnerable to phylogenetic distortions from horizontal gene transfer, which can lead to false assumptions about the evolutionary history of an organism (44), and from subjective interpretation of banding patterns and the (often invalid) presumption that similarly migrating bands represent the same chromosomal region.

The biological basis for the proliferation of H30-R and H30-Rx remains unclear. It is possible that antimicrobial resistance, the seemingly obvious explanation, is not the sole selective characteristic leading to the successive proliferation of H30-R and H30-Rx from within the H30 subclone. This is suggested by the fact that certain non-H30 ST131 isolates were identified that possessed the same phenotypic resistance traits without having reached a comparable level of success as H30-Rx. Increased virulence, as suggested by the significant association between H30-Rx and sepsis, could be a factor contributing to the success of this important subclone. Further investigations, including detailed comparative genomic, epidemiological, and functional studies, are needed to determine the basis for the success of H30-R and the strong association of H30-Rx with sepsis.

Regardless of mechanisms, the association of H30-Rx with sepsis, its broad multidrug resistance profile, and its rapid expansion and geographic dispersal warrant attention from the public health and clinical communities. Although continued accumulation of antibiotic resistance determinants may limit therapeutic options in the future, the clonal nature of H30-Rx may facilitate effective control strategies involving vaccines or transmission prevention.

MATERIALS AND METHODS 

Isolates and patients. The molecular epidemiological analyses used a large collection (n = 1,908) of recent, consecutive, single-patient E. coli isolates from 6 clinical microbiology laboratories in the United States and Germany. The United States isolates (n = 1,518) were recovered in 2010 and 2011 from 5 locations, including Group Health Cooperative, Harborview Medical Center, Seattle Children's Hospital, and University of Washington Medical Center (all in Seattle, WA) and the Veterans Affairs Medical Center in Minneapolis, MN, as described previously (15). The German isolates (n = 390) were recovered in 2012 at the University Hospital in Münster, Germany. All isolates underwent fumC-fimH (CH) clonotyping (45) to identify ST131 and its constituent CH clonotypes (i.e., fimH-specific subclones, including H30) and were assessed for ESBL production by disk diffusion as specified by the Clinical and Laboratory Standards Institute. Medical record data regarding presence of clinically diagnosed sepsis at the time of sample collection or during the subsequent 30 days were available for 1,133 (75%) of the 2010-2011 United States isolates. Each center's institutional review board approved the study protocol.

PFGE analysis. The 524 historical and recent ST131 isolates were subjected to standardized XbaI PFGE analysis, as described previously (46). The dendrogram was inferred within BioNumerics (Applied Maths) according to the unweighted pair group method based on Dice similarity coefficients.
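
For readers unfamiliar with the Dice coefficient, the sketch below shows how it scores the similarity of two banding patterns: twice the number of shared bands divided by the total number of bands in both profiles. It is an illustration only and does not reproduce the BioNumerics workflow. It assumes that bands have already been matched and reduced to discrete labels (real PFGE analysis matches band positions within a tolerance), and the profile names and band labels are hypothetical.

```python
from itertools import combinations


def dice(bands_a, bands_b):
    """Dice similarity between two band sets: 2 * shared / (size_a + size_b)."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0  # two empty profiles are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def pairwise_dice(profiles):
    """Return {(name1, name2): similarity} for every pair of profiles."""
    return {
        (x, y): dice(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }


# Hypothetical, pre-binned band labels for three isolates.
profiles = {
    "isoA": {"b1", "b2", "b3", "b4"},
    "isoB": {"b1", "b2", "b3", "b5"},
    "isoC": {"b6", "b7", "b1"},
}
similarities = pairwise_dice(profiles)
closest_pair = max(similarities, key=similarities.get)
```

An unweighted pair group (UPGMA) dendrogram is then built by repeatedly merging the most similar pair of clusters, starting from a pair such as `closest_pair`, and averaging similarities to the remaining clusters.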

Strain selection. Selection of ST131 isolates for genome sequencing was done in successive phases. First, to sample the breadth of phylogenetic diversity within the ST (to the extent that this is reflected in PFGE profiles), 20 isolates were selected to represent widely distributed clusters within a PFGE profile dendrogram based on a published collection of 524 historical and recent ST131 isolates from diverse locales, years of isolation, and hosts (Fig. 1A). In selecting the representative isolate(s) for a given PFGE cluster, priority was given to (i) most recent year of isolation, (ii) human host, and (iii) fluoroquinolone resistance. Next, 28 additional isolates were selected from these same PFGE clusters based on (i) proximity in the dendrogram to the initially selected (index) isolate and (ii) differences from the index isolate with respect to host and/or fluoroquinolone phenotype. Subsequently, an additional 60 isolates were selected from both this initial collection and a large collection of recent human clinical ST131 isolates from Seattle, WA, and Minneapolis, MN, that had undergone sequence analysis of gyrA, parC, and fimH (to define subclones within ST131) and PFGE analysis. Here, selection criteria included (i) distinctive gyrA, parC, and/or fimH alleles, or combinations thereof, (ii) outliers with respect to fluoroquinolone phenotype, in comparison with other isolates sharing the same PFGE type or gyrA-parC-fimH allele combination, and (iii) distinctive host species, clinical presentations (e.g., published case report isolates), specimen types (e.g., food or environmental), or dates of isolation (e.g., oldest known and oldest published ST131 isolates). Of the 108 total selected isolates, four isolates were subsequently excluded due to questionable authenticity, leaving 104 isolates for genome sequencing.

Genome sequencing. DNA samples were prepared for multiplexed, 
paired-end sequencing on an Illumina Genome Analyzer IIx (Illumina, 
Inc., San Diego, CA). For each isolate, 1 to 5 µg DNA in 200 µl was sheared 
in a 96-well plate with the SonicMAN (part no. SCM1000-3; Matrical 
Bioscience, Spokane, WA) to a size range of 200 to 1,000 bp, with the 
majority of material at ca. 600 bp, using the following parameters: 
prechill, 0°C for 75 s; cycles, 20; sonication, 10 s; power, 100%; lid chill, 
0°C for 75 s; plate chill, 0°C for 10 s; postchill, 0°C for 75 s. The sheared 
DNA was purified using the QIAquick PCR purification kit (catalogue no. 
28106; Qiagen, Valencia, CA). The enzymatic processing (end repair, 
phosphorylation, A tailing, and adaptor ligation) of the DNA followed the 
guidelines described in the Illumina protocol ("Preparing Samples for 
Multiplexed Paired-End Sequencing," catalogue no. PE-930-1002, part 
no. 1005361). The enzymes for processing were obtained from New Eng- 
land Biolabs (catalogue no. E6000L; New England Biolabs, Ipswich, MA), 
and the oligonucleotides and adaptors were obtained from Illumina (cat- 
alogue no. PE-400-1001). 

After ligation of the adaptors, the DNA was run on a 2% agarose gel for 
2 h, after which a gel slice containing 500- to 600-bp fragments of each 
DNA sample was isolated and purified using the QIAquick gel extraction 
kit (catalogue no. 28706; Qiagen, Valencia, CA). Individual libraries were 
quantified by quantitative PCR on an ABI 7900HT (part no. 4329001; Life 
Technologies Corporation, Carlsbad, CA) in triplicate at two concentra- 
tions, 1:1,000 and 1:2,000, using the Kapa library quantification kit (part 
no. KK4832 or KK4835; Kapa Biosystems, Woburn, MA). Based on the 
individual library concentrations, equimolar pools of no more than 12 
indexed E. coli libraries were prepared at a concentration of at least 1 nM 
using 10 mM Tris-HCl (pH 8.0) and 0.05% Tween 20. To ensure accurate 
loading onto the flow cell, the same quantification method was used to 
quantify the final pools. The pooled paired-end libraries were sequenced 
on an Illumina Genome Analyzer IIx to a read length of at least 76 bp. 
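
The pooling arithmetic can be illustrated with a short Python sketch. It assumes that each library contributes the same molar amount, that 1 nM equals 1 fmol/µl, and that no diluent is added; the function name, the 10-fmol target, and the example concentrations are hypothetical and are not taken from the protocol.

```python
MAX_LIBRARIES = 12   # upper limit per pool, from the text
MIN_POOL_NM = 1.0    # minimum pool concentration, from the text

def equimolar_pool(conc_nm, fmol_each=10.0):
    """conc_nm: {library: qPCR concentration in nM (equivalently fmol/ul)}.

    Returns (volume in ul per library, resulting pool concentration in nM).
    """
    if len(conc_nm) > MAX_LIBRARIES:
        raise ValueError("too many libraries for one pool")
    # Equal moles from each library: volume = amount / concentration.
    volumes = {lib: fmol_each / c for lib, c in conc_nm.items()}
    pool_nm = fmol_each * len(conc_nm) / sum(volumes.values())
    if pool_nm < MIN_POOL_NM:
        raise ValueError("pool would fall below 1 nM")
    return volumes, pool_nm

# Hypothetical concentrations for two indexed libraries.
volumes, pool_nm = equimolar_pool({"lib1": 2.0, "lib2": 4.0})
```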

The genomes were sequenced at an average depth of 60.93× (standard 
deviation [SD] = 31.66, using the 4,971,461-base NA114 chromosome as 
a reference). An average of 4,654,457.54 bases (SD = 385,629.23) were 
sequenced at ≥10× coverage. 

Identification of SNPs. Illumina whole-genome sequence data sets 
were aligned against the chromosome of a published ST131 reference 
genome (strain NA114; GenBank accession no. CP002797) (42) using the 
short-read alignment component of the Burrows-Wheeler Aligner. Each 
alignment was analyzed for SNPs using SolSNP, a Java-based DNA 
variant-calling tool for next-generation sequencing alignment data. Sol- 
SNP uses a modified Kolmogorov-Smirnov statistic and data filtering to 
call variants on high-coverage, aligned genomes (http://sourceforge.net 
/projects/solsnp/). To avoid false calls due to sequencing errors, SNP loci 
were excluded if they did not meet a minimum coverage of 10× or if the 
variant was present in <90% of the base calls for that position. SNP calls 
were combined for all of the sequenced genomes such that, for the locus to 
be included in the final SNP matrix, it had to be present in all of the 
genomes. SNPs falling in the duplicated regions on the reference genome 
were discarded. 
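
The filtering logic can be sketched in Python as follows. This is an illustration under stated assumptions and not SolSNP itself: it omits the modified Kolmogorov-Smirnov statistic and the exclusion of duplicated regions, it reads the criteria as retaining a base call only when coverage is at least 10× and the majority base accounts for at least 90% of base calls, and the input structure and names are hypothetical.

```python
def call_base(base_counts, min_cov=10, min_frac=0.90):
    """Return the majority base if it passes both thresholds, else None."""
    depth = sum(base_counts.values())
    if depth < min_cov:
        return None
    base, n = max(base_counts.items(), key=lambda kv: kv[1])
    return base if n / depth >= min_frac else None

def snp_matrix(pileups):
    """pileups: {genome: {position: {base: read_count}}}.

    A locus is kept only if every genome has a passing call there
    and the calls are not all identical.
    """
    genomes = sorted(pileups)
    positions = set.intersection(*(set(pileups[g]) for g in genomes))
    matrix = {}
    for pos in sorted(positions):
        calls = [call_base(pileups[g][pos]) for g in genomes]
        if None not in calls and len(set(calls)) > 1:
            matrix[pos] = dict(zip(genomes, calls))
    return matrix

# Hypothetical read counts at three reference positions in two genomes.
pileups = {
    "g1": {100: {"A": 20}, 200: {"C": 5}, 300: {"G": 12}},
    "g2": {100: {"T": 19, "A": 1}, 200: {"T": 30}, 300: {"G": 15}},
}
matrix = snp_matrix(pileups)
```

Here position 200 is dropped because one genome has only 5× coverage, and position 300 is dropped because it does not vary.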

Phylogenetic analysis. Phylogenetic trees were generated using the 
maximum-parsimony method in PAUP version 4.0b10. Using prior 
knowledge about near neighbors, the genome of a published E. coli strain 
belonging to phylogenetic group B2 (strain AA86; GenBank accession no. 
AFET00000000) was selected as an outgroup to root the ST131 whole- 
genome sequence tree (45). ST131 isolates in the clade nearest to this 
bifurcation point were used to root subsequent trees. 

The homoplasy index (HI) was calculated in PAUP using the formula 
HI = 1 - CI, where CI is the consistency index. CI serves to measure the 
relative amount of homoplasy in a cladogram, as assessed by the level of 
difficulty in fitting SNP alleles to a given tree. The CI is calculated using the 
formula CI = m/s, where m is the total number of expected character 
changes and s is the actual number of changes that occur in the tree. 
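
A worked example may help fix the arithmetic. The Python sketch below takes per-SNP values of m and s as given (it does not reconstruct changes on a tree, which PAUP does internally) and sums them across characters; the function names and the example counts are hypothetical.

```python
def consistency_index(expected_changes, actual_changes):
    """CI = m / s, summed over all characters (SNP loci)."""
    return sum(expected_changes) / sum(actual_changes)

def homoplasy_index(expected_changes, actual_changes):
    """HI = 1 - CI."""
    return 1 - consistency_index(expected_changes, actual_changes)

# Hypothetical data: 100 biallelic SNPs, each expected to change once;
# 98 change once on the tree and 2 change twice (homoplastic).
m = [1] * 100
s = [1] * 98 + [2] * 2
hi = homoplasy_index(m, s)   # 1 - 100/102
```

With no homoplasy, s equals m and HI is 0; each extra change on the tree raises s and therefore HI.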

Detection of H30-Rx-specific SNPs. Two SNPs that differentiate the 
CTX-M-15-associated subclone (H30-Rx) within the H30-R subclone 
from the rest of the H30 subclone were interrogated using Sanger se- 
quencing. SNP-200 was detected as a C-to-T transition at position 299 of 
the 460-bp PCR product generated using forward primer 5' GACACCA 
TGCGTTTTGCTTC 3' and reverse primer 5' TCGTACCGGCAACAAT 
TGAC 3'. SNP-264 was detected as a G-to-A transition at position 287 of 
the 462-bp PCR product generated using forward primer 5' GTGGCGA 
TTTCACGCTGTTA 3' and reverse primer 5' TATCCAGCACGTTCCA 
GGTG 3'. Isolates that tested positive for both SNPs were regarded as 
members of the H30-Rx subclone. 
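
The classification rule can be expressed as a short Python sketch. It assumes that positions are 1-based and counted along the forward-primer strand of each PCR product, which the text does not state explicitly; the function name and the synthetic sequences are hypothetical.

```python
# (position in PCR product, derived base), from the text.
SNP_ASSAYS = {"SNP-200": (299, "T"), "SNP-264": (287, "A")}

def is_h30rx(amplicons):
    """amplicons: {assay name: Sanger sequence of the PCR product}.

    True only if both assays show the derived base.
    """
    for name, (pos, derived) in SNP_ASSAYS.items():
        seq = amplicons.get(name, "")
        if len(seq) < pos or seq[pos - 1].upper() != derived:
            return False
    return True

# Synthetic placeholder sequences of the stated product lengths.
positive = {
    "SNP-200": "N" * 298 + "T" + "N" * 161,   # 460 bp
    "SNP-264": "N" * 286 + "A" + "N" * 175,   # 462 bp
}
negative = dict(positive, **{"SNP-200": "N" * 298 + "C" + "N" * 161})
```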

PCR-based detection of blaCTX-M-15. The CTX-M-15-encoding gene 
blaCTX-M-15 was detected by PCR using SNP-specific forward primer 5' A 
TAAAACCGGCAGCGGTGG 3' and universal reverse primer 5' GAATT 
TTGACGATCGGGG 3' (47). PCR conditions were 10 min of denatur- 
ation at 95°C, 33 cycles of 30 s at 94°C and 30 s at 67°C, followed by 7 min 
of elongation at 72°C. The blaCTX-M-15-specific 483-bp PCR product was 
detected by agarose gel electrophoresis. 

Location of CTX-M-15-encoding mobile element. Illumina short 
reads were aligned to the closed chromosome of JJ1886 (CP006784, 
CP006785, CP006786, CP006787, CP006788, and CP006789) (54) and 
the pEC_L8 closed plasmid sequence using BWA-MEM (48) version 
0.7.5a-r405 with the default settings for paired reads. Alignments were 
analyzed using the IGV tool (49, 50). The location of the conserved CTX- 
M-15 element (ISEcp1-blaCTX-M-15-orf477) was then inferred based on 
the sequence alignments. Close attention was paid to the boundaries of 
the element, since any unmapped or incorrectly mapped pairs or pairs 
with insertions or deletions provided clues as to the genomic location and 
possible rearrangements. Putative duplications were noted if the depth of 
coverage around the element was substantially higher than the flanking 
regions. Illumina short reads were also assembled with the Mira assembler 
(51). The contigs were aligned with Mauve (52) against JJ1886 and 
pEC_L8 to identify any chromosomal rearrangement in the contigs car- 
rying the CTX-M-15 element. 
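
The depth comparison used to flag putative duplications can be sketched in Python. The text gives no numerical cutoff for "substantially higher", so the sketch only reports the ratio; the function name, the flank width, and the example depths are hypothetical.

```python
from statistics import mean

def depth_ratio(depths, start, end, flank=1000):
    """Mean depth over the element (start:end, 0-based, end-exclusive)
    divided by mean depth over the flanking windows on both sides."""
    element = depths[start:end]
    flanks = depths[max(0, start - flank):start] + depths[end:end + flank]
    return mean(element) / mean(flanks)

# Hypothetical per-base depths: a 500-bp element at twice the flanking depth.
depths = [50] * 1000 + [100] * 500 + [50] * 1000
ratio = depth_ratio(depths, 1000, 1500)
```

A ratio near 1 is consistent with a single copy, whereas a ratio near 2 would be consistent with two copies of the element.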

fimH and gyrA-parC allele assignments. Sequenced isolates were as- 
sembled using VelvetOptimiser (version 2.2.2) and Velvet (53). All fimH, 
gyrA, and parC sequences were compared to an in-house sequence library 
using nucleotide-nucleotide BLAST (version 2.2.25+). Sequence similar- 
ity matches were determined using thresholds of 100% nucleotide identity 
and 100% coverage of the query sequence length. Allele designations were 
assigned based on an in-house nomenclature for the gyrA-parC combina- 
tion and fimH. 
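
The matching thresholds amount to accepting only exact, full-length hits. A minimal Python sketch follows, assuming that BLAST hits have already been parsed into records carrying percent identity, query alignment coordinates, and query length; the field names and allele labels are hypothetical and do not reflect the in-house library.

```python
def exact_allele_matches(hits):
    """Return allele names for hits with 100% identity over 100% of the query.

    hits: iterable of dicts with keys
          'allele', 'pident', 'qstart', 'qend', 'qlen'.
    """
    matches = []
    for h in hits:
        aligned = h["qend"] - h["qstart"] + 1
        if h["pident"] == 100.0 and aligned == h["qlen"]:
            matches.append(h["allele"])
    return matches

# Hypothetical parsed hits for one query sequence of 489 bp.
hits = [
    {"allele": "allele_a", "pident": 100.0, "qstart": 1, "qend": 489, "qlen": 489},
    {"allele": "allele_b", "pident": 99.8, "qstart": 1, "qend": 489, "qlen": 489},
    {"allele": "allele_c", "pident": 100.0, "qstart": 1, "qend": 400, "qlen": 489},
]
matched = exact_allele_matches(hits)
```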

Statistical methods. Comparisons of proportions were tested using 
Fisher's exact test or a chi-square test (two tailed), with P values of <0.05 
as the criterion for significance. 
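
For reference, a two-tailed Fisher's exact test on a 2-by-2 table can be computed directly from the hypergeometric distribution. The Python sketch below is a generic standard-library illustration rather than the software used in the study; it sums the probabilities of all tables that are no more likely than the observed one, and the example counts are hypothetical.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

ALPHA = 0.05  # significance criterion, from the text

# Hypothetical counts: outcome present/absent in two groups.
p_value = fisher_exact_two_tailed(8, 2, 1, 5)
significant = p_value < ALPHA
```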

Accession number. All Illumina sequences were deposited into the 
NCBI SRA (http://www.ncbi.nlm.nih.gov/sra), study accession number 
SRP027327. 